Computing decision boundaries for geometric examples in dimensions 2 and 3

In this notebook we use the gradient flow from gdeep.decision_boundary to compute the decision boundary of a neural network trained on a binary classification problem.

Example 1: Decision boundary for two circles in $\mathbb R^2$

We construct two data sets $A$ and $B$ with the labels $0$ and $1$. The two circles are either concentric or, after translating one of them by $[2, 0]$, separated from one another. The cell below generates the concentric case with datasets.make_circles.

In [2]:
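# Assumes `from sklearn import datasets`, `import pandas as pd` and a `seed` value from an earlier cell
data, label = datasets.make_circles(n_samples=5000, noise=0.05, factor=0.3, random_state=seed)
df = pd.DataFrame(data, columns=["x", "y"])
df["label"] = label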
In [3]:
px.scatter(df,x="x",y="y",color="label")

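We distribute points uniformly in the box [(-1,1),(-1,1)] and push each point $x_i$ in the direction opposite to the gradient of $(\mathrm{Net}(x_i)-0.5)^2$. This expression vanishes exactly where the network outputs $0.5$, so the points drift toward the decision boundary. The step is repeated n_epochs times.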
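The update described above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the implementation behind GradientFlow: `toy_net` is a hypothetical stand-in for a trained classifier (a sigmoid of the distance to a circle of radius 0.65), the gradient is approximated by central finite differences instead of automatic differentiation, and the names `loss_gradient`, `gradient_flow`, `lr` and `n_samples` are chosen for this example.

```python
import numpy as np


def toy_net(x, radius=0.65, sharpness=10.0):
    """Hypothetical classifier: close to 1 inside the circle, close to 0 outside."""
    r = np.linalg.norm(x, axis=1)
    return 1.0 / (1.0 + np.exp(-sharpness * (radius - r)))


def loss_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of (f(x) - 0.5)^2 for each row of x."""
    grad = np.zeros_like(x)
    for k in range(x.shape[1]):
        step = np.zeros(x.shape[1])
        step[k] = eps
        upper = (f(x + step) - 0.5) ** 2
        lower = (f(x - step) - 0.5) ** 2
        grad[:, k] = (upper - lower) / (2.0 * eps)
    return grad


def gradient_flow(f, boundary_tuple, n_samples=200, n_epochs=500, lr=0.1, seed=0):
    """Sample points uniformly in a box and move them against the loss gradient."""
    rng = np.random.default_rng(seed)
    low = np.array([b[0] for b in boundary_tuple], dtype=float)
    high = np.array([b[1] for b in boundary_tuple], dtype=float)
    x = rng.uniform(low, high, size=(n_samples, len(boundary_tuple)))
    for _ in range(n_epochs):
        x = x - lr * loss_gradient(f, x)
    return x


points = gradient_flow(toy_net, [(-1, 1), (-1, 1)])
# Distance of the classifier output from 0.5 after the flow
residual = np.abs(toy_net(points) - 0.5)
```

For this toy classifier the points should end up close to the circle of radius 0.65, which is where `toy_net` equals 0.5. The step size and the number of epochs were picked by hand for this example and would need tuning for a real network.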

In [5]:
gradient_flow_db_circle = GradientFlow(circle_detect, boundary_tuple=[(-1,1),(-1,1)])
sample_points_boundary = gradient_flow_db_circle()
plot_decision_boundary(data, label, sample_points_boundary, n_components=2)

We can verify our result by looking at the contour plot of the neural net.

In [6]:
plot_activation_contours(circle_detect, delta=0.01)
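The same check can be done numerically by evaluating the classifier on a regular grid and locating the places where its output crosses 0.5. The sketch below is an assumption-laden illustration and does not reproduce plot_activation_contours: `toy_net` is a hypothetical stand-in for the trained network, and the crossing detection only compares horizontally adjacent grid nodes.

```python
import numpy as np


def toy_net(x, radius=0.65, sharpness=10.0):
    """Hypothetical classifier: close to 1 inside the circle, close to 0 outside."""
    r = np.linalg.norm(x, axis=1)
    return 1.0 / (1.0 + np.exp(-sharpness * (radius - r)))


delta = 0.01
axis = np.linspace(-1.0, 1.0, 201)  # grid spacing equals delta
xx, yy = np.meshgrid(axis, axis)
grid = np.column_stack([xx.ravel(), yy.ravel()])
values = toy_net(grid).reshape(xx.shape)

# A crossing lies between two horizontally adjacent nodes with different predictions.
above = values > 0.5
flips = above[:, 1:] != above[:, :-1]
crossing_x = 0.5 * (xx[:, 1:] + xx[:, :-1])[flips]
crossing_y = yy[:, 1:][flips]
crossing_radius = np.hypot(crossing_x, crossing_y)
```

Every detected crossing sits within one grid step of the circle of radius 0.65, so points returned by a gradient flow on the same classifier can be compared against `crossing_x` and `crossing_y`.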

Example 2: Decision boundary for two tori in $\mathbb R^3$

We first generate a binary data set consisting of two unentangled tori in $\mathbb R^3$, and then a second one consisting of two entangled tori.

The point clouds for the tori and the labels are stored in a dictionary.

In [6]:
torus_point_cloud = {'ent': {0: {}, 1: {}}, 'unent': {0: {}, 1: {}}}
torus_labels = {'ent': {0: {}, 1: {}}, 'unent': {0: {}, 1: {}}}

# Generate torus point cloud for unentangled tori
torus_point_cloud['unent'][0], torus_labels['unent'][0] = make_torus_point_cloud(0, 50, 0.0,\
    Rotation(1,2,math.pi/2), np.array([[0,0,0]]), radius=.3)
torus_point_cloud['unent'][1], torus_labels['unent'][1]  = make_torus_point_cloud(1, 50, 0.0,\
    Rotation(1,2,0), np.array([[6,0,0]]), radius=.3)

# Generate torus point cloud for entangled tori
torus_point_cloud['ent'][0], torus_labels['ent'][0] = make_torus_point_cloud(0, 50, 0.0,\
    Rotation(1,2,math.pi/2), np.array([[0,0,0]]), radius=.3)
torus_point_cloud['ent'][1], torus_labels['ent'][1]  = make_torus_point_cloud(1, 50, 0.0,\
    Rotation(1,2,0), np.array([[2,0,0]]), radius=.3)


# Concatenate torus point clouds
tori_point_cloud = {}
tori_labels = {}

for config in ['ent', 'unent']:
    tori_point_cloud[config] = np.concatenate((torus_point_cloud[config][0],\
                                torus_point_cloud[config][1]), axis=0)
    tori_labels[config] = np.concatenate((torus_labels[config][0],\
                                torus_labels[config][1]), axis=0)

# Plot data sets
df_tori = {}

for config in ['ent', 'unent']:
    df_tori[config] = pd.DataFrame(tori_point_cloud[config], columns = ["x", "y", "z"])
    fig = px.scatter_3d(df_tori[config], x="x", y="y", z="z", color=tori_labels[config], title="Tori "+config+"angled")
    fig.show()
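The helper make_torus_point_cloud is not shown in this notebook. As a rough idea of what such a helper might do, the sketch below samples a torus from its standard parametrization, rotates one copy by $\pi/2$ and shifts the other along the $x$ axis. The function names, the major radius of 2 and the choice of rotation axis are assumptions made for this example; the real helper may differ in all of them.

```python
import numpy as np


def sample_torus(n_per_angle, major_radius=2.0, minor_radius=0.3, center=(0.0, 0.0, 0.0)):
    """Regular grid of n_per_angle**2 points on a torus whose axis is the z axis."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_per_angle, endpoint=False)
    theta, phi = np.meshgrid(angles, angles)
    ring = major_radius + minor_radius * np.cos(theta)
    points = np.column_stack([
        (ring * np.cos(phi)).ravel(),
        (ring * np.sin(phi)).ravel(),
        (minor_radius * np.sin(theta)).ravel(),
    ])
    return points + np.asarray(center, dtype=float)


def rotate_about_x(points, angle):
    """Rotate points about the x axis by the given angle in radians."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    return points @ rotation.T


# First torus rotated into the xz-plane, second torus shifted so that the two interlock
torus_a = rotate_about_x(sample_torus(50), np.pi / 2)
torus_b = sample_torus(50, center=(2.0, 0.0, 0.0))

cloud = np.concatenate([torus_a, torus_b], axis=0)
labels = np.concatenate([np.zeros(len(torus_a), dtype=int), np.ones(len(torus_b), dtype=int)])
```

With a major radius of 2, shifting the second torus by 2 along $x$ makes the core circle of the first torus pass through the hole of the second, while a shift of 6 would leave the two tori disjoint.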

In the next step we will train fully connected neural networks on the binary classification task.

In [7]:
# Define neural network architecture
tori_detect_nn = {}
net1 = Net(0, [3,20,20,20,20])
net2 = Net(0, [3,20,20,20,10])
tori_detect_nn['unent'] = net1
tori_detect_nn['ent'] = net2

# Print the architecture of both neural nets
for config in ['ent', 'unent']:
    print('Architecture of Neural Net for ' + config + 'angled:\n', tori_detect_nn[config])

# Train neural nets on data sets
for config in ['ent', 'unent']:
    print('Training of Neural Net for ' + config + 'angled')
    train_classification_nn(tori_detect_nn[config], tori_point_cloud[config], tori_labels[config], n_epochs=10)
Architecture of Neural Net for entangled:
 Net(
  (layer0): Linear(in_features=3, out_features=20, bias=True)
  (layer1): Linear(in_features=20, out_features=20, bias=True)
  (layer2): Linear(in_features=20, out_features=20, bias=True)
  (layer3): Linear(in_features=20, out_features=10, bias=True)
  (layer4): Linear(in_features=10, out_features=2, bias=True)
)
Architecture of Neural Net for unentangled:
 Net(
  (layer0): Linear(in_features=3, out_features=20, bias=True)
  (layer1): Linear(in_features=20, out_features=20, bias=True)
  (layer2): Linear(in_features=20, out_features=20, bias=True)
  (layer3): Linear(in_features=20, out_features=20, bias=True)
  (layer4): Linear(in_features=20, out_features=2, bias=True)
)
Training of Neural Net for entangled
epoch train_loss valid_loss accuracy time
0 0.805826 0.823263 0.490000 00:00
1 0.649458 0.495566 0.817000 00:00
2 0.438577 0.372436 0.943000 00:00
3 0.332907 0.316930 1.000000 00:00
4 0.317086 0.314821 1.000000 00:00
5 0.314761 0.314136 1.000000 00:00
6 0.314128 0.313842 1.000000 00:00
7 0.313848 0.313700 1.000000 00:00
8 0.313727 0.313643 1.000000 00:01
9 0.313690 0.313631 1.000000 00:01
Training of Neural Net for unentangled
epoch train_loss valid_loss accuracy time
0 0.764844 0.740010 0.575000 00:01
1 0.410327 0.321139 1.000000 00:01
2 0.322438 0.313940 1.000000 00:00
3 0.314233 0.313453 1.000000 00:00
4 0.313441 0.313357 1.000000 00:00
5 0.313330 0.313317 1.000000 00:00
6 0.313300 0.313297 1.000000 00:01
7 0.313288 0.313288 1.000000 00:00
8 0.313283 0.313284 1.000000 00:00
9 0.313281 0.313284 1.000000 00:01
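Both runs reach an accuracy of 1.0 while the losses level off near 0.3133 instead of approaching 0. One possible reading, which is an assumption and not something the notebook states about Net or train_classification_nn, is that a softmax cross-entropy is applied to class scores that are already confined to $[0, 1]$. In that case the smallest attainable loss for a sample is $\log(1 + e^{-1})$, which the short check below evaluates.

```python
import math


def cross_entropy_two_class(scores, target):
    """Softmax cross-entropy for a single sample with two class scores."""
    top = max(scores)
    shifted = [s - top for s in scores]
    log_norm = math.log(sum(math.exp(s) for s in shifted))
    return -(shifted[target] - log_norm)


# Scores (1, 0) with target class 0: the best case if scores cannot leave [0, 1]
loss_floor = cross_entropy_two_class([1.0, 0.0], target=0)
print(round(loss_floor, 6))
```

The value agrees with the plateau in the tables above to about four decimal places, which is consistent with this reading but does not confirm it.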

As in the first example, we sample points in a box and let them flow toward the decision boundary using our gradient flow method.

In [8]:
# Apply gradient flow to detect decision boundary
n_samples = 10000

boundary_tuple = {}

boundary_tuple['ent']   = [(-2, 4), (-2, 2), (-2, 2)]
boundary_tuple['unent'] = [(-3, 7), (-2, 2), (-2, 2)]

sample_points_boundary = {}

for config in ['ent', 'unent']:
    gradient_flow_db_tori = GradientFlow(tori_detect_nn[config], boundary_tuple=boundary_tuple[config])
    sample_points_boundary[config] = gradient_flow_db_tori()

In the last step we plot the data set and the computed decision boundary.

In [9]:
for config in ['ent', 'unent']:
    plot_decision_boundary(tori_point_cloud[config], tori_labels[config], sample_points_boundary[config], n_components=3)

TODO: Density of decision boundary points

To be written.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance_matrix.html
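Until this section is written, here is one possible starting point, offered as a sketch only: the distance from each sampled boundary point to its nearest neighbour gives a simple local density measure, with small distances indicating densely covered regions. The function name `nearest_neighbor_distances` and the test data are made up for this example. The pairwise distances are computed with NumPy broadcasting; scipy.spatial.distance_matrix from the link above produces the same kind of matrix.

```python
import numpy as np


def nearest_neighbor_distances(points):
    """Distance from every point to its closest other point."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore the zero distance of a point to itself
    return dist.min(axis=1)


# Test data: 100 evenly spaced points on the unit circle
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])
nn_dist = nearest_neighbor_distances(circle)
```

For evenly spaced points all nearest-neighbour distances coincide, so a wide spread of `nn_dist` on real output would point to unevenly covered parts of the boundary. The full distance matrix grows quadratically with the number of points, so for several thousand points a tree-based neighbour search is likely a better fit.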
